<!DOCTYPE html>
<html prefix="og: http://ogp.me/ns# article: http://ogp.me/ns/article# " lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>regex | 绿萝间</title>
<link href="../assets/css/all-nocdn.css" rel="stylesheet" type="text/css">
<link href="../assets/css/ipython.min.css" rel="stylesheet" type="text/css">
<link href="../assets/css/nikola_ipython.css" rel="stylesheet" type="text/css">
<meta name="theme-color" content="#5670d4">
<meta name="generator" content="Nikola (getnikola.com)">
<link rel="alternate" type="application/rss+xml" title="RSS" href="../rss.xml">
<link rel="canonical" href="https://muxuezi.github.io/posts/regex.html">
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
    tex2jax: {
        inlineMath: [ ['$','$'], ["\\(","\\)"] ],
        displayMath: [ ['$$','$$'], ["\\[","\\]"] ],
        processEscapes: true
    },
    displayAlign: 'center', // Change this to 'center' to center equations.
    "HTML-CSS": {
        styles: {'.MathJax_Display': {"margin": 0}}
    }
});
</script><!--[if lt IE 9]><script src="../assets/js/html5.js"></script><![endif]--><meta name="author" content="Tao Junjie">
<link rel="prev" href="ruby-on-rails-with-ruby2-0.html" title="ruby-on-rails-with-ruby2-0" type="text/html">
<link rel="next" href="raw_input.html" title="raw_input" type="text/html">
<meta property="og:site_name" content="绿萝间">
<meta property="og:title" content="regex">
<meta property="og:url" content="https://muxuezi.github.io/posts/regex.html">
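<meta property="og:description" content="Regular expressions are practical and efficient for string processing, form validation, and similar tasks. Some commonly used expressions are collected here for future reference.
Regular expression for Chinese characters: [\u4e00-\u9fa5]
Comment: matching Chinese text is a real headache; this expression makes it manageable.
Regular expression for double-byte characters (including Chinese characters): [^\x00-\xff]
Comment: can be used to compute the length of a string (a double-byte character counts as 2, an ASCII character as 1).
Regular expression for blank lines: \n\s*\r
Comment: can be used to delete blank lines.
Regular expression for HT">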
<meta property="og:type" content="article">
<meta property="article:published_time" content="2010-01-01T13:44:27+08:00">
<meta property="article:tag" content="Linux">
<meta property="article:tag" content="WordPress">
</head>
<body>
<a href="#content" class="sr-only sr-only-focusable">Skip to main content</a>

<!-- Menubar -->

<nav class="navbar navbar-inverse navbar-static-top"><div class="container">
<!-- This keeps the margins nice -->
        <div class="navbar-header">
            <button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#bs-navbar" aria-controls="bs-navbar" aria-expanded="false">
            <span class="sr-only">Toggle navigation</span>
            <span class="icon-bar"></span>
            <span class="icon-bar"></span>
            <span class="icon-bar"></span>
            </button>
            <a class="navbar-brand" href="https://muxuezi.github.io/">

                <span id="blog-title">绿萝间</span>
            </a>
        </div>
<!-- /.navbar-header -->
        <div class="collapse navbar-collapse" id="bs-navbar" aria-expanded="false">
            <ul class="nav navbar-nav">
<li>
<a href="../archive.html">Archive</a>
                </li>
<li>
<a href="../categories/">Tags</a>
                </li>
<li>
<a href="../rss.xml">RSS feed</a>

                
            </li>
</ul>
<ul class="nav navbar-nav navbar-right">
<li>
    <a href="regex.wp" id="sourcelink">Source</a>
    </li>

                
            </ul>
</div>
<!-- /.navbar-collapse -->
    </div>
<!-- /.container -->
</nav><!-- End of Menubar --><div class="container" id="content" role="main">
    <div class="body-content">
        <!--Body content-->
        <div class="row">
            
            
<article class="post-text h-entry hentry postpage" itemscope="itemscope" itemtype="http://schema.org/Article"><header><h1 class="p-name entry-title" itemprop="headline name"><a href="#" class="u-url">regex</a></h1>

        <div class="metadata">
            <p class="byline author vcard"><span class="byline-name fn">
                    Tao Junjie
            </span></p>
            <p class="dateline"><a href="#" rel="bookmark"><time class="published dt-published" datetime="2010-01-01T13:44:27+08:00" itemprop="datePublished" title="2010-01-01 13:44">2010-01-01 13:44</time></a></p>
            
        <p class="sourceline"><a href="regex.wp" id="sourcelink">Source</a></p>

        </div>
        

    </header><div class="e-content entry-content" itemprop="articleBody text">
    <div>
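<p>Regular expressions are practical and efficient for string processing, form validation, and similar tasks. Some commonly used expressions are collected here for future reference.</p>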
<p>Regular expression for Chinese characters: [\u4e00-\u9fa5]</p>
<p>Comment: matching Chinese text is a real headache; this expression makes it manageable.</p>
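<p>A minimal Python sketch of how the Chinese-character class might be used. The function name <code>chinese_chars</code> is an illustrative assumption, not part of the original list.</p>

```python
import re

# Characters in the range U+4E00..U+9FA5, as in the pattern above.
CHINESE_CHAR = re.compile(r"[\u4e00-\u9fa5]")


def chinese_chars(text):
    """Return every character of `text` that falls inside the range."""
    return CHINESE_CHAR.findall(text)


print(chinese_chars("regex 正则表达式 101"))
```

<p>Letters, digits, and spaces are skipped, so only the five Chinese characters are returned.</p>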
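<p>Regular expression for double-byte characters (including Chinese characters): [^\x00-\xff]</p>
<p>Comment: can be used to compute the length of a string (a double-byte character counts as 2, an ASCII character as 1).</p>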
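<p>The length calculation described in the comment can be sketched as follows. This assumes that "double-byte" means any character outside U+0000..U+00FF, which is what the character class actually tests; <code>display_length</code> is a hypothetical helper name.</p>

```python
import re

# Any character outside the range U+0000..U+00FF.
NON_LATIN1 = re.compile(r"[^\x00-\xff]")


def display_length(text):
    """Count wide characters as 2 and all other characters as 1."""
    # len(text) counts every character once; add 1 more per wide character.
    return len(text) + len(NON_LATIN1.findall(text))


print(display_length("abc"))
print(display_length("中文ab"))
```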
<p>Regular expression for blank lines: \n\s*\r</p>
<p>Comment: can be used to delete blank lines.</p>
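<p>A sketch of blank-line removal with this pattern. It assumes the text uses CRLF line endings, since the pattern looks for a line feed followed later by a carriage return; <code>drop_blank_lines</code> is an illustrative name.</p>

```python
import re

# A line feed, optional whitespace, then the carriage return of the next line ending.
BLANK_LINE = re.compile(r"\n\s*\r")


def drop_blank_lines(text):
    """Remove empty or whitespace-only lines from CRLF-terminated text."""
    # Deleting the match joins the "\r" before it with the "\n" after it,
    # so the surviving lines keep a single "\r\n" between them.
    return BLANK_LINE.sub("", text)


print(repr(drop_blank_lines("a\r\n\r\nb")))
```

<p>Text that uses bare line feeds contains no carriage return, so it passes through unchanged.</p>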
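<p>Regular expression for HTML tags: &lt;(\S*?)[^&gt;]*&gt;.*?&lt;/\1&gt;|&lt;.*? /&gt;</p>
<p>Comment: the versions circulating online are poor. The one above still matches only some cases and cannot handle complex nested tags.</p>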
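<p>Regular expression for leading and trailing whitespace: ^\s*|\s*$</p>
<p>Comment: can be used to delete whitespace (spaces, tabs, form feeds, and so on) at the start and end of a line. A very useful expression.</p>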
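<p>A short sketch of trimming with this pattern. In Python the built-in <code>str.strip()</code> does the same job for whitespace, so the regex form is mainly useful in languages or tools without such a method; <code>trim</code> is an illustrative name.</p>

```python
import re

# Whitespace at the very start, or whitespace running up to the very end.
EDGE_WHITESPACE = re.compile(r"^\s*|\s*$")


def trim(text):
    """Remove leading and trailing whitespace, leaving inner whitespace alone."""
    return EDGE_WHITESPACE.sub("", text)


print(repr(trim("  hello world \t\n")))
```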
<p>Regular expression for email addresses: \w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*</p>
<p>Comment: very practical for form validation.</p>
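<p>For form validation the whole input has to match, not just a substring. A minimal sketch, assuming Python's <code>re.fullmatch</code> is an acceptable way to enforce that; <code>looks_like_email</code> is a hypothetical helper, and the pattern is a plausibility check rather than a full address validator.</p>

```python
import re

EMAIL = re.compile(r"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*")


def looks_like_email(value):
    """Return True when the entire string has the shape local@domain.tld."""
    return EMAIL.fullmatch(value) is not None


print(looks_like_email("user.name+tag@example.com"))
print(looks_like_email("user@localhost"))
```

<p>The second call is rejected because the pattern requires at least one dot after the @ sign.</p>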
<p>Regular expression for URLs: [a-zA-Z]+://[^\s]*</p>
<p>Comment: the versions circulating online are quite limited; the one above covers most needs.</p>
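<p>A sketch of pulling URLs out of running text, using the letter class [a-zA-Z] for the scheme. The helper name <code>find_urls</code> and the sample text are illustrative assumptions.</p>

```python
import re

# A scheme made of letters, "://", then everything up to the next whitespace.
URL = re.compile(r"[a-zA-Z]+://[^\s]*")


def find_urls(text):
    """Return every scheme://... run found in `text`."""
    return URL.findall(text)


print(find_urls("Docs at https://getnikola.com and ftp://example.org/pub today"))
```

<p>Because the match runs until the next whitespace, punctuation that directly follows a URL in a sentence is captured with it and may need to be stripped afterwards.</p>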
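<p>Checking whether an account name is valid (starts with a letter, 5 to 16 bytes long, letters, digits, and underscores allowed): ^[a-zA-Z][a-zA-Z0-9_]{4,15}$</p>
<p>Comment: very practical for form validation.</p>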
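<p>A minimal sketch of the account-name check. <code>re.fullmatch</code> is used here as an assumption about how strict the check should be: in Python, <code>$</code> alone would also accept a trailing newline. <code>is_valid_account</code> is a hypothetical name.</p>

```python
import re

# One letter, then 4 to 15 letters, digits, or underscores: 5 to 16 characters in total.
ACCOUNT = re.compile(r"^[a-zA-Z][a-zA-Z0-9_]{4,15}$")


def is_valid_account(name):
    """Return True when `name` satisfies the account-name rule."""
    return ACCOUNT.fullmatch(name) is not None


print(is_valid_account("alice_01"))
print(is_valid_account("1alice"))
```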
<p>Regular expression for domestic (Chinese) telephone numbers: \d{3}-\d{8}|\d{4}-\d{7}</p>
<p>Comment: matches forms such as 0511-4405222 or 021-87888822.</p>
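<p>The two number shapes from the comment can be extracted from text like this. The helper name <code>find_phone_numbers</code> and the sample sentence are illustrative.</p>

```python
import re

# Either a 3-digit area code with 8 digits, or a 4-digit area code with 7 digits.
PHONE = re.compile(r"\d{3}-\d{8}|\d{4}-\d{7}")


def find_phone_numbers(text):
    """Return every area-code/number pair found in `text`."""
    return PHONE.findall(text)


print(find_phone_numbers("Call 0511-4405222 or 021-87888822."))
```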
<p>Regular expression for Tencent QQ numbers: [1-9][0-9]{4,}</p>
<p>Comment: Tencent QQ numbers start at 10000.</p>
<p>Regular expression for Chinese postal codes: [1-9]\d{5}(?!\d)</p>
<p>Comment: Chinese postal codes are 6 digits long.</p>
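<p>A sketch showing what the negative lookahead <code>(?!\d)</code> contributes: a six-digit run is accepted only if no further digit follows it. The helper name <code>find_postal_codes</code> is an assumption for illustration.</p>

```python
import re

# Six digits, the first non-zero, not followed by another digit.
POSTAL_CODE = re.compile(r"[1-9]\d{5}(?!\d)")


def find_postal_codes(text):
    """Return every six-digit code that is not followed by a further digit."""
    return POSTAL_CODE.findall(text)


print(find_postal_codes("Postcode 100080, Shanghai 200000"))
print(find_postal_codes("1000801"))
```

<p>The pattern has no matching guard on the left, so the last six digits of a longer digit run can still match; adding <code>(?&lt;!\d)</code> in front is one way to close that gap.</p>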
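<p>Regular expression for ID card numbers: \d{18}|\d{15}</p>
<p>Comment: Chinese ID card numbers are 15 or 18 digits long. The 18-digit alternative is listed first so that a search does not stop after the first 15 digits of an 18-digit number.</p>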
<p>Regular expression for IP addresses: \d+\.\d+\.\d+\.\d+</p>
<p>Comment: useful for extracting IP addresses.</p>
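<p>The pattern finds anything shaped like four dotted numbers, so it also accepts values such as 999.1.1.1. A sketch of extraction followed by a range check, assuming the standard-library <code>ipaddress</code> module is an acceptable validator; <code>extract_ipv4</code> is a hypothetical helper.</p>

```python
import ipaddress
import re

# Four runs of digits separated by literal dots.
DOTTED_QUAD = re.compile(r"\d+\.\d+\.\d+\.\d+")


def extract_ipv4(text):
    """Return the dotted quads in `text` that are valid IPv4 addresses."""
    valid = []
    for candidate in DOTTED_QUAD.findall(text):
        try:
            # Raises a ValueError subclass when an octet is out of range.
            ipaddress.IPv4Address(candidate)
        except ValueError:
            continue
        valid.append(candidate)
    return valid


print(extract_ipv4("from 192.168.0.1 to 999.1.1.1 via 10.0.0.254"))
```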
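<p>Matching specific kinds of numbers:</p>
<p>^[1-9]\d*$ // matches positive integers</p>
<p>^-[1-9]\d*$ // matches negative integers</p>
<p>^(-?[1-9]\d*|0)$ // matches integers</p>
<p>^([1-9]\d*|0)$ // matches non-negative integers (positive integers + 0)</p>
<p>^(-[1-9]\d*|0)$ // matches non-positive integers (negative integers + 0)</p>
<p>^([1-9]\d*\.\d*|0\.\d*[1-9]\d*)$ // matches positive floating-point numbers</p>
<p>^-([1-9]\d*\.\d*|0\.\d*[1-9]\d*)$ // matches negative floating-point numbers</p>
<p>^-?([1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0)$ // matches floating-point numbers</p>
<p>^([1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0)$ // matches non-negative floating-point numbers (positive floating-point numbers + 0)</p>
<p>^(-([1-9]\d*\.\d*|0\.\d*[1-9]\d*)|0?\.0+|0)$ // matches non-positive floating-point numbers (negative floating-point numbers + 0)</p>
<p>Comment: useful when processing large amounts of data; adjust as needed for the specific application.</p>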
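<p>A sketch of why the alternatives in the non-negative integer pattern need to be grouped between the anchors. Alternation binds more loosely than <code>^</code> and <code>$</code>, so without parentheses each anchor applies to only one branch. The names <code>UNGROUPED</code>, <code>GROUPED</code>, and <code>is_non_negative_int</code> are illustrative.</p>

```python
import re

# Without a group: "starts with a positive integer" OR "ends with 0".
UNGROUPED = re.compile(r"^[1-9]\d*|0$")
# With a group: the whole string is a positive integer or exactly "0".
GROUPED = re.compile(r"^([1-9]\d*|0)$")


def is_non_negative_int(text, pattern=GROUPED):
    """Return True when `pattern` finds a match in `text`."""
    return pattern.search(text) is not None


for sample in ["42", "0", "42abc", "-10"]:
    print(sample, is_non_negative_int(sample), is_non_negative_int(sample, UNGROUPED))
```

<p>The ungrouped form accepts "42abc" (it starts with digits) and "-10" (it ends in 0), while the grouped form rejects both.</p>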
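<p>Matching specific kinds of strings:</p>
<p>^[A-Za-z]+$ // matches strings made up of the 26 English letters</p>
<p>^[A-Z]+$ // matches strings made up of the 26 uppercase English letters</p>
<p>^[a-z]+$ // matches strings made up of the 26 lowercase English letters</p>
<p>^[A-Za-z0-9]+$ // matches strings made up of digits and the 26 English letters</p>
<p>^\w+$ // matches strings made up of digits, the 26 English letters, or underscores</p>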
</div>
    </div>
    <aside class="postpromonav"><nav><ul itemprop="keywords" class="tags">
<li><a class="tag p-category" href="../categories/linux.html" rel="tag">Linux</a></li>
            <li><a class="tag p-category" href="../categories/wordpress.html" rel="tag">WordPress</a></li>
        </ul>
<ul class="pager hidden-print">
<li class="previous">
                <a href="ruby-on-rails-with-ruby2-0.html" rel="prev" title="ruby-on-rails-with-ruby2-0">Previous post</a>
            </li>
            <li class="next">
                <a href="raw_input.html" rel="next" title="raw_input">Next post</a>
            </li>
        </ul></nav></aside></article>
</div>
        <!--End of body content-->

        <footer id="footer">
            Contents © 2017         <a href="mailto:muxuezi@gmail.com">Tao Junjie</a> - Powered by         <a href="https://getnikola.com" rel="nofollow">Nikola</a>         
<a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0">
<img alt="Creative Commons License BY-NC-SA" style="border-width:0; margin-bottom:12px;" src="http://i.creativecommons.org/l/by-nc-sa/4.0/80x15.png"></a>
            
        </footer>
</div>
</div>


            <script src="../assets/js/all-nocdn.js"></script><script>$('a.image-reference:not(.islink) img:not(.islink)').parent().colorbox({rel:"gal",maxWidth:"100%",maxHeight:"100%",scalePhotos:true});</script><!-- fancy dates --><script>
    moment.locale("en");
    fancydates(0, "YYYY-MM-DD HH:mm");
    </script><!-- end fancy dates --><script>
  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
  })(window,document,'script','//www.google-analytics.com/analytics.js','ga');

  ga('create', 'UA-51330059-1', 'auto');
  ga('send', 'pageview');

</script>
</body>
</html>
